The Cooperative Study Between the Hadoop Big Data Platform and the Traditional Data Warehouse

Author

  • Ping Hu
Abstract

Based on how traditional data warehouses are currently applied and on the projected development of the Hadoop big data platform, this paper proposes a new framework in which Hadoop and the traditional data warehouse cooperate, addressing the problem that the traditional data warehouse alone can hardly meet customers' demands. The framework originates from ideas put forward by the designers of Cloudera and Teradata. The architecture is divided into three modules: data acquisition, data storage, and data applications. The paper mainly considers the collection, storage, and application of structured and unstructured data, and studies how Hadoop and the traditional data warehouse collaborate in data storage and data application. For data collection and transmission, it adopts Apache Sqoop; for storage, it relies on the Hadoop HDFS file system and the Hive data warehouse. It also introduces data applications in Hive. Finally, a prototype system demonstrates the feasibility of the designed architecture.
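As a rough illustration of the data acquisition step, the sketch below assembles a Sqoop import command that copies one relational warehouse table into a Hive table. It is not taken from the paper: the function name, JDBC URL, user name, and table names are hypothetical placeholders, and the source is assumed to be reachable through a generic JDBC connection. The command is only built and printed, not executed.

```python
from typing import List


def build_sqoop_import(jdbc_url: str, username: str, table: str,
                       hive_table: str, num_mappers: int = 4) -> List[str]:
    """Build the argument list for a Sqoop import that loads a table into Hive.

    All argument values are supplied by the caller; nothing here reflects
    the configuration used in the paper's prototype.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    return [
        "sqoop", "import",
        "--connect", jdbc_url,        # JDBC URL of the source warehouse
        "--username", username,
        "-P",                         # prompt for the password at run time
        "--table", table,             # source table to copy
        "--hive-import",              # load the imported data into Hive
        "--hive-table", hive_table,   # destination Hive table
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


# Hypothetical example values, for illustration only.
cmd = build_sqoop_import(
    jdbc_url="jdbc:mysql://dw.example.com/sales",
    username="etl_user",
    table="orders",
    hive_table="staging.orders",
    num_mappers=2,
)
print(" ".join(cmd))
```

Building the command as a list rather than a single string keeps each argument separate, so it could later be passed to a process launcher without shell quoting problems.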

Similar resources

Cloud Computing Technology Algorithms Capabilities in Managing and Processing Big Data in Business Organizations: MapReduce, Hadoop, Parallel Programming

The objective of this study is to verify the importance of the capabilities of cloud computing services in managing and analyzing big data in business organizations, because the rapid development in the use of information technology in general, and network technology in particular, has led many organizations to make their applications available for use via electronic platforms hos...

A Fuzzy TOPSIS Approach for Big Data Analytics Platform Selection

Big data sizes are constantly increasing. Big data analytics is where advanced analytic techniques are applied on big data sets. Analytics based on large data samples reveals and leverages business change. The popularity of big data analytics platforms, which are often available as open-source, has not remained unnoticed by big companies. Google uses MapReduce for PageRank and inverted indexes....

2016 Olympic Games on Twitter: Sentiment Analysis of Sports Fans Tweets using Big Data Framework

Big data analytics is one of the most important subjects in computer science. Today, due to the increasing expansion of Web technology, a large amount of data is available to researchers. Extracting information from these data is one of the requirements for many organizations and business centers. In recent years, the massive amount of Twitter's social networking data has become a platform for ...

ETL Workflow Generation for Offloading Dormant Data from the Data Warehouse to Hadoop

The technologies developed to address the needs of Big Data have presented a vast number of beneficial opportunities for use alongside the traditional Data Warehouse (DW). There are several proposed use cases for using Apache Hadoop as a complement to traditional DWs as a Big Data platform. One of these use cases is the offloading of "dormant data", that is, infrequently used or inactive data fr...

A Comparative Study of Hadoop-based Big Data Architectures

Big Data is a concept popularized in recent years to reflect the fact that organizations are confronted with large volumes of data to be processed and this, of course, presents a strong commercial and marketing challenge. This trend around the analysis and collection of Big Data has given rise to new solutions that combine traditional data warehouse technologies with Big Data systems in a logic...

Publication date: 2015